biplotEZ

User-friendly biplots in R



Centre for Multi-Dimensional Data Visualisation (MuViSU)
muvisu@sun.ac.za



SASA 2024

Introduction

The biplotEZ package aims to provide users with EZier software to construct biplots.


What is a biplot?

Visualisation of multi-dimensional data in 2 or 3 dimensions.
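As a minimal sketch of this idea in base R (not the biplotEZ implementation), a PCA biplot can be built from the singular value decomposition of the centred data. The first two right singular vectors define the plane for the samples and the directions of the variable axes. The names `Z`, `V2` and `quality` are illustrative choices made for this sketch.

```r
# Centre the four numeric iris variables (no scaling: an assumption of this sketch)
X  <- scale(as.matrix(iris[, 1:4]), center = TRUE, scale = FALSE)
sv <- svd(X)

V2 <- sv$v[, 1:2]          # p x 2: directions of the variable axes
Z  <- X %*% V2             # n x 2: sample coordinates in the biplot plane

# Best rank-2 approximation of X and the proportion of variation it displays
X_hat   <- Z %*% t(V2)
quality <- sum(sv$d[1:2]^2) / sum(sv$d^2)

# Samples as points, variables as arrows from the origin
plot(Z, col = iris$Species, asp = 1, xlab = "PC1", ylab = "PC2")
arrows(0, 0, V2[, 1], V2[, 2], length = 0.1)
text(V2, labels = colnames(X), pos = 3)
```

The value of `quality` indicates how much of the total variation survives the reduction to two dimensions, so it is a natural first check on whether a 2D display is adequate.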


A brief history of biplots and biplotEZ

History (1)

1971

Gabriel, K.R., The biplot graphic display of matrices with application to principal component analysis. Biometrika, 58(3), pp.453-467.

1976

Prof Niël J le Roux presents a seminar on biplots.

Photo NJ_hist

History (2)

1996

John Gower publishes Biplots with David Hand.

Photo GH_hist

Prof le Roux introduces a Masters module on Biplots (Multidimensional scaling).

Rika Cilliers obtains her Masters on biplots for socio-economic progress under Prof le Roux.

History (3)

1997

SASA conference paper: S-PLUS FUNCTIONS FOR INTERACTIVE LINEAR AND NON-LINEAR BIPLOTS by SP van Blerk, NJ le Roux & S Gardner.

2001

Sugnet Gardner (Lubbe) obtains her PhD on biplots under Prof le Roux.

Photo SL_hist

History (4)

2001

Louise Wood obtains her Masters on biplots for socio-economic development under Prof le Roux.

2003

Adele Bothma obtains her Masters on biplots for school results under Prof le Roux.

2007

Idele Walters obtains her Masters on biplots for exploring the gender gap under Prof le Roux.

History (5)

2008

Ryan Wedlake obtains his Masters on robust biplots under Prof le Roux.

2009

BiplotGUI for Interactive Biplots, Anthony le Grange.

2010

André Mostert obtains his Masters on biplots in industry under Prof le Roux.

History (6)

2011

John Gower, Sugnet Lubbe and Niël le Roux publish Understanding Biplots.

Photo UB_hist

R package UBbipl developed with the book, but never published.

History (7)

2013

Hilmarie Brand obtains her Masters on PCA and CVA biplots under Prof le Roux.

2014

Opeoluwe Oyedele obtains her PhD on Partial Least Squares Biplots under Sugnet Lubbe.

2015

Ruan Rossouw obtains his PhD on using biplots for multivariate process monitoring under Prof le Roux.

2016

Ben Gurr obtains his Masters on biplots for crime data under Prof le Roux.

History (8)

2019

Johané Nienkemper-Swanepoel obtains her PhD on MCA biplots under Prof le Roux and Sugnet Lubbe.

Photo JN_hist

Carel van der Merwe obtains his PhD using biplots. Carel supervises 4 Master’s projects on biplots.

  • Justin Perrang, Francesca van Niekerk, David Rodwell, Delia Sandilands

History (9)

2020

Raeesa Ganey obtains her PhD on Principal Surface Biplots under Sugnet Lubbe.

Photo RG_hist

André Mostert obtains his PhD on multidimensional scaling for identification of contributions to out of control multivariate processes under Sugnet Lubbe.

History (10)

2020

Adriaan Rowen obtains his Master’s using biplots to understand black-box machine learning models.

2022

Zoë-Mae Adams obtains her Masters on biplots in sentiment classification under Johané Nienkemper-Swanepoel.

Photo ZA_hist

History (11)

2023

bipl5 for Exploding Biplots, Ruan Buys.

2024

Ruan Buys obtains his Masters on Exploding biplots under Carel van der Merwe.

Photo RB_hist

History (12)

2024

Adriaan Rowen to submit his PhD using biplots to understand black-box machine learning models.

Peter Manefeldt to submit his Masters using multidimensional scaling for interpretability of random forest models.

Photo PM_hist

The Team

Photo 1

Photo 2

Photo 3

Photo 4

Photo 5

Photo 6

Photo 7

More biplots

  • CVA biplots for two classes
  • Regression biplot
  • Spline biplot

CVA biplots for two classes

Canonical space of dimension 1.

Solve \(\mathbf{BM=WM\Lambda}\) where \(\mathbf{M} = \begin{bmatrix} \mathbf{m}_1 & \mathbf{M}_2\\ \end{bmatrix}\)

\[ \bar{\mathbf{Y}} = \bar{\mathbf{X}} \mathbf{M} = \begin{bmatrix} \bar{y}_{11} & 0 & \dots & 0 \\ \vdots & \vdots & & \vdots\\ \bar{y}_{K1} & 0 & \dots & 0 \\ \end{bmatrix} \]

\[ \mathbf{\Lambda} = \mathrm{diag}(\lambda, 0, \dots, 0) \]

Total squared reconstruction error for means: \(TSREM = tr\{ (\bar{\mathbf{X}}-\hat{\bar{\mathbf{X}}})(\bar{\mathbf{X}}-\hat{\bar{\mathbf{X}}})'\} = 0\)

Total squared reconstruction error for samples: \(TSRES = tr\{ ({\mathbf{X}}-\hat{{\mathbf{X}}})({\mathbf{X}}-\hat{{\mathbf{X}}})'\} >0\)
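The one-dimensional canonical space can be checked numerically with a short base R sketch on the two-class subset of iris used later in this section. It is not the biplotEZ code: the size-weighted between-class matrix, the scaling \(\mathbf{M'WM = I}\) and all object names are assumptions made for illustration.

```r
# Two-class data: versicolor and virginica, centred at the overall mean
X <- scale(as.matrix(iris[51:150, 1:4]), center = TRUE, scale = FALSE)
g <- droplevels(iris$Species[51:150])
p <- ncol(X)

G    <- model.matrix(~ g - 1)            # n x K class indicator matrix
N    <- crossprod(G)                     # K x K diagonal matrix of class sizes
Xbar <- solve(N, crossprod(G, X))        # K x p matrix of class means
B    <- t(Xbar) %*% N %*% Xbar           # between-class matrix (size-weighted: an assumption)
W    <- crossprod(X) - B                 # within-class matrix

# Solve BM = WM Lambda with M'WM = I through the symmetric form W^{-1/2} B W^{-1/2}
eW  <- eigen(W, symmetric = TRUE)
Wis <- eW$vectors %*% diag(1 / sqrt(eW$values)) %*% t(eW$vectors)
eig <- eigen(Wis %*% B %*% Wis, symmetric = TRUE)
M      <- Wis %*% eig$vectors
lambda <- eig$values                     # one positive value, the rest numerically zero

Ybar <- Xbar %*% M                       # canonical means: only column 1 is non-zero

# Reconstruct from the first canonical dimension only
J     <- diag(c(1, rep(0, p - 1)))
Minv  <- solve(M)
TSREM <- sum((Xbar - Xbar %*% M %*% J %*% Minv)^2)   # means: zero up to rounding
TSRES <- sum((X - X %*% M %*% J %*% Minv)^2)         # samples: strictly positive
```

With two classes the means are reproduced exactly by the first canonical variate, so any second display axis is free to be chosen for the benefit of the samples.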

CVA biplots for two classes

Minimise \(TSRES\) (Default option)

Alternative option: Maximise Bhattacharyya distance. For more details see

  • le Roux, N. and Gardner-Lubbe, S., 2024. A two-group canonical variate analysis biplot for an optimal display of both means and cases. Advances in Data Analysis and Classification, pp.1-28.

\[ \mathbf{M}^{-1} = \begin{bmatrix} \mathbf{m}^{(1)} \\ \mathbf{M}^{(2)}\\ \end{bmatrix} \]

\[ \mathbf{M}^{(2)}\mathbf{M}^{(2)'} = \mathbf{UDV}' \]

\[ \mathbf{M}_{opt} = \begin{bmatrix} \mathbf{m}_1 & \mathbf{M}_2\mathbf{V}\\ \end{bmatrix} \]
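The construction of \(\mathbf{M}_{opt}\) can be followed step by step in base R. This is a sketch under stated assumptions (size-weighted \(\mathbf{B}\), scaling \(\mathbf{M'WM = I}\), data centred at the overall mean), with illustrative names such as `tsres` and `M_opt`; it is not taken from the biplotEZ source. Under these assumptions the TSRES of the two-dimensional display works out to the sum of the singular values in \(\mathbf{D}\) that are not used, which is why the leading column of \(\mathbf{V}\) is taken as the second axis.

```r
# Set-up: two-class CVA on versicolor and virginica
X <- scale(as.matrix(iris[51:150, 1:4]), center = TRUE, scale = FALSE)
g <- droplevels(iris$Species[51:150])
G <- model.matrix(~ g - 1)
Xbar <- solve(crossprod(G), crossprod(G, X))
B <- t(Xbar) %*% crossprod(G) %*% Xbar
W <- crossprod(X) - B
eW  <- eigen(W, symmetric = TRUE)
Wis <- eW$vectors %*% diag(1 / sqrt(eW$values)) %*% t(eW$vectors)
M   <- Wis %*% eigen(Wis %*% B %*% Wis, symmetric = TRUE)$vectors

# Partition M = [m_1 M_2] and M^{-1} = [m^(1); M^(2)]
Minv  <- solve(M)
M2    <- M[, -1, drop = FALSE]
Minv2 <- Minv[-1, , drop = FALSE]

# SVD of M^(2) M^(2)' and rotation of the arbitrary columns M_2
sv    <- svd(Minv2 %*% t(Minv2))
M_opt <- cbind(M[, 1], M2 %*% sv$v)

# TSRES when the samples are reconstructed from the first r columns
tsres <- function(M, r = 2) {
  J <- diag(c(rep(1, r), rep(0, ncol(M) - r)))
  sum((X - X %*% M %*% J %*% solve(M))^2)
}

tsres(M)          # arbitrary second axis
tsres(M_opt)      # rotated second axis: no larger than the value above
sum(sv$d[-1])     # agrees with tsres(M_opt) in this sketch
```

The first column of `M_opt` is unchanged, so the class means are still represented exactly; only the direction that spreads out the samples is altered.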

CVA biplots for two classes

library (biplotEZ)
# Two-class CVA biplot (versicolor and virginica) with enlarged class means
biplot (iris[51:150,]) |> CVA (classes = iris[51:150,5]) |> means (cex=2) |>
  axes (label.dir = "Hor", label.line=c(0.8,0,0,0)) |> plot ()

Regression biplot

Any 2D representation of sample points, for example

library (MASS)
# iris rows 102 and 143 are identical; sammon () rejects zero distances, so row 102 is dropped
Zmat <- sammon (dist(iris[-102,1:4], method="manhattan"))$points
# Initial stress        : 0.01116
# stress after  10 iters: 0.00833, magic = 0.018
# stress after  20 iters: 0.00614, magic = 0.213
# stress after  30 iters: 0.00561, magic = 0.500
# stress after  40 iters: 0.00558, magic = 0.500

To create a biplot we need to add information on the variables.

\[ \mathbf{X}:n \times p \]

\[ \mathbf{Z}:n \times 2 \]

\[ \mathbf{X = ZB + E} \]

\[ \mathbf{B = (Z'Z)}^{-1}\mathbf{Z'X} \]
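A minimal base R sketch of this regression step follows. Because \(\mathbf{X}\) is regressed on \(\mathbf{Z}\), the least squares coefficient matrix is \(2 \times p\), with one column per variable giving the direction of that variable's axis in the plane. Centring both matrices and the names `B`, `X_hat` and `R2` are assumptions of the sketch, not a description of what `regress ()` does internally.

```r
library(MASS)

# Row 102 of iris duplicates row 143, and sammon() does not accept zero distances
X    <- scale(as.matrix(iris[-102, 1:4]), center = TRUE, scale = FALSE)
Zmat <- sammon(dist(iris[-102, 1:4], method = "manhattan"), trace = FALSE)$points
Z    <- scale(Zmat, center = TRUE, scale = FALSE)   # centring Z is an assumption

# Least squares fit of X = ZB + E
B <- solve(crossprod(Z), crossprod(Z, X))           # 2 x p coefficient matrix

# Fitted values and the proportion of each variable explained by the 2D map
X_hat <- Z %*% B
R2    <- 1 - colSums((X - X_hat)^2) / colSums(X^2)
round(R2, 3)
```

Variables with a low `R2` are poorly described by a straight axis in this map, which is one motivation for considering curved axes.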

Regression biplot

biplot (iris[-102,]) |> regress (Zmat) |>  plot ()

Spline biplot

Are linear axes a good representation when the transformation from \(\mathbf{X}:n \times p\) to \(\mathbf{Z}:n \times 2\) is nonlinear?

Replace linear regression with splines.
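To give a feel for what a curved axis looks like, the sketch below traces how the position in the Sammon map changes with a single variable by smoothing each map coordinate against that variable with a natural cubic spline. This is a deliberately simplified stand-in and not the spline-axis algorithm used by `regress (Zmat, axes="splines")`; the choice of `Petal.Length`, `df = 3` and all object names are assumptions for illustration.

```r
library(MASS)
library(splines)

# Same 2D map as before (row 102 dropped because it duplicates row 143)
X    <- iris[-102, 1:4]
Zmat <- sammon(dist(X, method = "manhattan"), trace = FALSE)$points

# Smooth both map coordinates against one variable
x          <- X$Petal.Length
fit_spline <- lm(Zmat ~ ns(x, df = 3))
fit_linear <- lm(Zmat ~ x)                # straight-line counterpart

# Evaluate the fitted trajectory on a grid of values of the variable
grid      <- data.frame(x = seq(min(x), max(x), length.out = 25))
curve_pts <- predict(fit_spline, newdata = grid)   # 25 x 2 matrix of map positions

# Residual sums of squares: the spline fit cannot be worse than the linear fit
c(spline = sum(resid(fit_spline)^2), linear = sum(resid(fit_linear)^2))
```

Plotting `curve_pts` with `lines()` over the map shows a bent trajectory wherever the relationship between the variable and the map is nonlinear.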

Spline biplot

biplot (iris[-102,1:4]) |> regress (Zmat, axes="splines") |>  plot ()
# Calculating spline axis for variable 1 
# Calculating spline axis for variable 2 
# Calculating spline axis for variable 3 
# Calculating spline axis for variable 4